Abstract

We consider the problem of recovering continuous multi-dimensional functions $f$
from noisy observations over the regular grid $m^{-1}\mathbb{Z}^d$, $m\in\mathbb{N}_*$.
Our focus is on adaptive estimation in the case when the function can be well
recovered using a linear filter, which can depend on the unknown function itself.
In the companion paper [26] we have shown that in the case when there exists an
adapted time-invariant filter which locally recovers the unknown signal "well",
there is a numerically efficient construction of an adaptive filter which recovers
the signal "almost as well". In the current paper we study the application of the
proposed estimation techniques in the nonparametric regression setting. Namely,
we propose an adaptive estimation procedure for "locally well-filtered" signals
(some typical examples being smooth signals, modulated smooth signals and harmonic
functions) and show that the rate of recovery of such signals in the $\ell_p$-norm
on the grid is essentially the same as the rate for regular signals with
nonhomogeneous smoothness.

Key words: Nonparametric denoising, adaptive filtering, minimax estimation, 
nonparametric regression. 



1 Introduction 

Let $F = (\Omega, \Sigma, P)$ be a probability space. We consider the problem of
recovering an unknown complex-valued random field
$(s_\tau = s_\tau(\xi))_{\tau\in\mathbb{Z}^d}$ over $\mathbb{Z}^d$ from noisy
observations

$$y_\tau = s_\tau + e_\tau. \tag{1}$$

We assume that the field $(e_\tau)$ of observation noises is independent of
$(s_\tau)$ and is of the form $e_\tau = \sigma\epsilon_\tau$, where
$(\epsilon_\tau)$ are mutually independent standard Gaussian complex-valued
variables; the adjective "standard" means that $\Re(\epsilon_\tau)$,
$\Im(\epsilon_\tau)$ are independent of each other $N(0,1)$ random variables.
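The observation model (1) can be simulated in a few lines. The following is a minimal sketch, not part of the original construction: the helper names `standard_complex_gaussian` and `observe`, the toy one-dimensional signal and the value of `sigma` are all assumptions made for illustration.

```python
import random


def standard_complex_gaussian(rng: random.Random) -> complex:
    """One 'standard' complex Gaussian variable: the real and imaginary
    parts are independent N(0, 1) variables, as described in the text."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def observe(signal, sigma: float, rng: random.Random):
    """Return y_tau = s_tau + sigma * eps_tau for every entry of `signal`."""
    return [s + sigma * standard_complex_gaussian(rng) for s in signal]


rng = random.Random(0)
signal = [complex(t, 0.0) for t in range(5)]  # toy 1-D field (assumption)
y = observe(signal, sigma=0.5, rng=rng)
```

With `sigma = 0` the observations coincide with the signal, which is a convenient sanity check of the noise scaling.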

We suppose that the observations (1) come from a function ("signal") $f$ of
continuous argument (which we assume to vary in the $d$-dimensional unit cube
$[0,1]^d$); this function is observed in noise along an $n$-point equidistant
grid in $[0,1]^d$, and the problem is to recover $f$ via these observations.
This problem fits the framework of nonparametric regression estimation with a
"traditional setting" as follows:

A. The objective is to recover an unknown smooth function $f:[0,1]^d\to\mathbb{R}$,
which is sampled on the observation grid
$\Gamma_n = \{x_\tau = m^{-1}\tau : 0\le\tau_1,\dots,\tau_d\le m\}$ with
$(m+1)^d = n$, so that $s_\tau = f(x_\tau)$. The error of recovery is measured
with some functional norm (or a semi-norm) $\|\cdot\|$ on $[0,1]^d$, and the risk
of a recovery $\hat f$ of $f$ is the expectation $E_f\|\hat f - f\|^2$;

B. The estimation routines are aimed at recovering smooth signals, and their
quality is measured by their maximal risks, the maximum being taken over $f$
running through natural families of smooth signals, e.g., Hölder or Sobolev
balls;

C. The focus is on the asymptotic behavior of the estimation routines as the
volume of observations $n$ goes to infinity, with emphasis on asymptotically
minimax (nearly) optimal estimates - those which attain (nearly) best possible
rates of convergence of the risks to 0 as the observation sample size
$n\to\infty$.

Initially, the research was focused on recovering smooth signals with a priori
known smoothness parameters, and the estimation routines were tuned to these
parameters (see, e.g., [23,34,38,24,2,31,39,22,36,21,27]). Later on, there was
significant research on adaptive estimation. Adaptive estimation methods are
free of a priori assumptions on the smoothness parameters of the signal to be
recovered, and the primary goal is to develop routines which exhibit
asymptotically (nearly) optimal behavior on a wide variety of families of smooth
functions (cf. [35,28,29,30,6,8,9,25,3,7,19]). For a more complete overview of
results on smooth nonparametric regression estimation see, for instance,

The traditional focus on recovering smooth signals ultimately comes from the
fact that such a signal locally can be well-approximated by a polynomial of a
fixed order $r$, and such a polynomial is an "easy to estimate" entity.
Specifically, for every integer $T\ge 0$, the value of a polynomial $p$ at an
observation point $x_t$ can be recovered via $(2T+1)^d$ neighboring observations
$\{x_\tau : |\tau_j - t_j|\le T,\ 1\le j\le d\}$ "at a parametric rate" - with
the expected squared error $C\sigma^2(2T+1)^{-d}$, which is inversely
proportional to the number $(2T+1)^d$ of observations used by the estimate. The
coefficient $C$ depends solely on the order $r$ and the dimensionality $d$ of the
polynomial. The corresponding estimate $\hat p(x_t)$ of $p(x_t)$ is pretty
simple: it is given by a "time-invariant filter", that is, by convolution of
observations with an appropriate discrete kernel
$q^{(T)} = (q^{(T)}_\tau)_{\tau\in\mathbb{Z}^d}$ vanishing outside the box
$O_T = \{\tau\in\mathbb{Z}^d : |\tau_j|\le T,\ 1\le j\le d\}$:

$$\hat p(x_t) = \sum_{\tau\in O_T} q^{(T)}_\tau\, y_{t-\tau};$$

then the estimate $\hat f$ of $f(x_t)$ is taken as $\hat f = \hat p(x_t)$.

Note that the kernel $q^{(T)}$ is readily given by the degree $r$ of the
approximating polynomial, $T$ and the dimension $d$. The "classical" adaptation
routines take care of choosing "good" values of the approximation parameters
(namely, $T$ and $r$). On the other hand, the polynomial approximation
"mechanism" is supposed to be fixed once and for all. Thus, in those procedures
the "form" of the kernel is considered as given in advance.
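To make the notion of a time-invariant filter concrete, here is a minimal sketch for $d = 1$ and polynomials of order one (affine functions). The box kernel and the helper names `box_kernel` and `apply_filter` are illustrative assumptions, not the kernels used in the paper: a symmetric moving average reproduces an affine signal exactly at the window center, and its noise variance factor is $1/(2T+1)$, i.e. the "parametric rate" mentioned above.

```python
def box_kernel(T: int):
    """Kernel q_tau = 1/(2T+1) for |tau| <= T and zero elsewhere (1-D case)."""
    w = 1.0 / (2 * T + 1)
    return {tau: w for tau in range(-T, T + 1)}


def apply_filter(q, y, t):
    """Time-invariant filter output: sum over tau of q_tau * y[t - tau]."""
    return sum(q_tau * y[t - tau] for tau, q_tau in q.items())


T = 3
q = box_kernel(T)
line = [2.0 + 0.5 * t for t in range(20)]  # noiseless affine signal
estimate = apply_filter(q, line, 10)       # reproduces line[10] up to rounding
# Sum of squared kernel entries: the noise variance of the estimate is
# sigma^2 times this factor, here 1/(2T+1).
variance_factor = sum(v * v for v in q.values())
```

Higher polynomial orders require kernels with negative entries, but the principle (a fixed kernel, known in advance, applied by convolution) is the same.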

In the companion paper [26] (referred to hereafter as Part I) we have introduced
the notion of a well-filtered signal. In brief, the signal
$(s_\tau)_{\tau\in\mathbb{Z}^d}$ is $T$-well-filtered for some
$T\in\mathbb{N}_+$ if there is a filter (kernel) $q = q^{(T)}$ vanishing outside
$O_T$ which recovers $(s_\tau)$ in the box $\{u : |u-t|\le 3T\}$ with the mean
square error comparable with $\sigma^2T^{-d}$:

$$\max_{u:\,|u-t|\le 3T} E\Big\{\Big|s_u - \sum_{\tau\in O_T} q^{(T)}_\tau\, y_{u-\tau}\Big|^2\Big\} \le O(\sigma^2T^{-d}).$$


The universe of these signals is much wider than that of smooth signals. As we
have seen in Part I, it contains, in particular, "modulated smooth signals" -
sums of a fixed number of products of smooth functions and multivariate harmonic
oscillations of unknown (and arbitrarily high) frequencies.

^ Our "brief outline" of adaptive approach to nonparametric regression would be 
severely incomplete without mentioning a novel approach aimed at recovering non- 
smooth signals possessing sparse representations in properly constructed functional 
systems [5,10,4,11,12,13,14,15,16,17,37,18]. This promising approach is completely 
beyond the scope of our paper. 
We have shown in Part I that whenever a discrete time signal (that is, a signal
defined on a regular discrete grid) is well-filtered, we can recover this signal
at a "nearly parametric" rate without a priori knowledge of the associated
filter. In other words, a well-filtered signal can be recovered on the
observation grid basically as well as if it were an algebraic polynomial of a
given order.
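A pure harmonic oscillation is the simplest example of a signal that is exactly reproduced by a short filter of small norm, whatever its frequency. The sketch below (one-dimensional; the helper names `harmonic_filter` and `apply_filter` and the numerical values are assumptions for illustration) checks that the filter $q_\tau = e^{i\omega\tau}/(2T+1)$, $|\tau|\le T$, reproduces $s_t = e^{i\omega t}$ and has $\ell_2$-norm $(2T+1)^{-1/2}$. Note that this filter depends on the frequency $\omega$, which the statistician does not know; this is precisely why an adaptive construction is needed.

```python
import cmath
import math


def harmonic_filter(omega: float, T: int):
    """q_tau = exp(i*omega*tau) / (2T+1) for |tau| <= T (hypothetical helper)."""
    return {tau: cmath.exp(1j * omega * tau) / (2 * T + 1)
            for tau in range(-T, T + 1)}


def apply_filter(q, s, t):
    """Filter output sum_tau q_tau * s(t - tau); `s` maps integers to values."""
    return sum(q_tau * s(t - tau) for tau, q_tau in q.items())


omega, T = 2.7, 5
s = lambda t: cmath.exp(1j * omega * t)  # pure harmonic oscillation
q = harmonic_filter(omega, T)
# Largest deviation between the filtered and the original signal.
reproduction_error = max(abs(apply_filter(q, s, t) - s(t))
                         for t in range(-15, 16))
# Euclidean norm of the filter, equal to (2T+1)^(-1/2).
l2_norm = math.sqrt(sum(abs(v) ** 2 for v in q.values()))
```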

We are about to demonstrate that the results of Part I on recovering
well-filtered signals of unknown structure can be applied to recovering
nonparametric signals which admit well-filtered local approximations. Such an
extension has an unavoidable price - now we cannot hope to recover the signal
well outside of the observation grid (a highly oscillating signal can merely
vanish on the observation grid and be arbitrarily large outside it). As a
result, in what follows we are interested in recovering the signals along the
observation grid only and, consequently, replace the error measures based on the
functional norms on $[0,1]^d$ by their grid analogues.

The estimates to be developed will be "doubly adaptive", that is, adaptive with
respect to both the structures of well-filtered approximations of our signals
and the "approximation rate" (the dependence between the size of a neighborhood
of a point where the signal in question is approximated and the quality of
approximation in this neighborhood), neither of which is known in advance. Note
that in the case of smooth signals, this approximation rate is exactly what is
given by the smoothness parameters. The results to follow can be seen as
extensions of the results of [32,20] (see also [33]) dealing with the particular
case of univariate signals satisfying differential inequalities with unknown
differential operators.



2 Nonparametric regression problem 

We start with the formal description of the components of the nonparametric 
regression problem. 

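For $\tau\in\mathbb{Z}^d$, let $|\tau| = \max\{|\tau_1|,\dots,|\tau_d|\}$, and
let $\tau\le m$ for some $m\in\mathbb{N}$ denote $\tau_i\le m$, $i=1,\dots,d$.
Let $m$ be a positive integer, $n = (m+1)^d$, and let
$\Gamma_n = \{x = m^{-1}\alpha : \alpha\in\mathbb{Z}^d,\ 0\le\alpha,\ |\alpha|\le m\}$.

Let $C([0,1]^d)$ be the linear space of complex-valued fields over $[0,1]^d$. We
associate with a signal $f\in C([0,1]^d)$ its observations along $\Gamma_n$:

$$y \equiv y_f(\epsilon) = \{y_\tau \equiv y_\tau(f,\epsilon) = f(m^{-1}\tau) + e_\tau,\ e_\tau = \sigma\epsilon_\tau\}_{0\le\tau\le m}, \tag{2}$$

where $\{\epsilon_\tau\}_{\tau\in\mathbb{Z}^d}$ are independent standard
Gaussian complex-valued random noises. Our goal is to recover $f|_{\Gamma_n}$
from observations (2). In what follows, we write

$$f_\tau = f(m^{-1}\tau) \qquad [\tau\in\mathbb{Z}^d,\ m^{-1}\tau\in[0,1]^d].$$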

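Below we use the following notation. For a set $B\subset[0,1]^d$, we denote by
$Z(B)$ the set of all $t\in\mathbb{Z}^d$ such that $m^{-1}t\in B$. We denote by
$\|\cdot\|_{p,B}$ the standard $L_p$-norm on $B$:

$$\|g\|_{p,B} = \Big(\int_{x\in B}|g(x)|^p\,dx\Big)^{1/p},$$

and by $|g|_{q,B}$ its discrete analogue, so that

$$|g|_{q,B} = m^{-d/q}\Big(\sum_{\tau\in Z(B)}|g_\tau|^q\Big)^{1/q}.$$

We set

$$\Gamma_n^o = \Gamma_n\cap(0,1)^d = \{m^{-1}t : t\in\mathbb{Z}^d,\ t>0,\ |t|<m\}.$$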

Let $x = m^{-1}t\in\Gamma_n^o$. We say that a nonempty open cube

$$B_h(x) = \{u \mid |u_i - x_i| < h/2,\ i=1,\dots,d\}$$

centered at $x$ is admissible for $x$, if $B_h(x)\subset[0,1]^d$. For such a
cube, $T_h(x)$ denotes the largest nonnegative integer $T$ such that

$$Z(B_h(x)) \supset \{\tau\in\mathbb{Z}^d : |\tau-t|\le 4T\}.$$
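The quantity $T_h(x)$ is easy to compute. The sketch below rests on one observation: since $x = m^{-1}t$ is a grid point, $m^{-1}\tau$ lies in the open cube $B_h(x)$ exactly when $|\tau_i - t_i| < mh/2$ for all $i$, so the requirement reads $4T < mh/2$. The function names `is_admissible` and `window_T` are hypothetical.

```python
def is_admissible(x, h: float) -> bool:
    """True if the open cube B_h(x) = {u : |u_i - x_i| < h/2} lies in [0, 1]^d."""
    return h > 0 and all(xi - h / 2 >= 0.0 and xi + h / 2 <= 1.0 for xi in x)


def window_T(m: int, h: float) -> int:
    """Largest integer T >= 0 with {tau : |tau - t| <= 4T} inside Z(B_h(x)).

    The grid point tau/m belongs to the open cube B_h(t/m) exactly when
    |tau_i - t_i| < m*h/2, hence the condition is 4T < m*h/2.
    """
    T = 0
    while 4 * (T + 1) < m * h / 2:
        T += 1
    return T
```

For instance, with $m = 128$ a window of width $h = 0.5$ gives $T = 7$, while any window with $h \le m^{-1}$ gives $T = 0$.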

For a cube

$$B = \{x\in\mathbb{R}^d : |x_i - c_i|\le h/2,\ i=1,\dots,d\},$$

$D(B) = h$ stands for the edge of $B$. For $\gamma\in(0,1)$ we denote by

$$B_\gamma = \{x\in\mathbb{R}^d : |x_i - c_i|\le\gamma h/2,\ i=1,\dots,d\}$$

the $\gamma$-shrinkage of $B$ to the center of $B$.



2.1 Classes of locally well-filtered signals



Recall that we say that a function on $[0,1]^d$ is smooth if it can be locally
well-approximated by a polynomial. Informally, the definition below says that a
continuous signal $f\in C([0,1]^d)$ is locally well-filtered if $f$ admits a
good local approximation by a well-filtered discrete signal $\phi$ on $\Gamma_n$
(see Definition 1 of Section 2.1, Part I).

Definition 1 Let $B\subset[0,1]^d$ be a cube, $k$ be a positive integer,
$\rho\ge1$, $R\ge0$ be reals, and let $p\in(d,\infty]$. The collection
$B,k,\rho,R,p$ specifies the family $F^{k,\rho,p}(B;R)$ of locally well-filtered
on $B$ signals $f$ defined by the following requirements:

(1) $f\in C([0,1]^d)$;

(2) There exists a nonnegative function $F\in L_p(B)$, $\|F\|_{p,B}\le R$, such
that for every $x = m^{-1}t\in\Gamma_n\cap\mathrm{int}\,B$ and for every
admissible for $x$ cube $B_h(x)$ contained in $B$ there exists a field
$\phi\in C(\mathbb{Z}^d)$ such that $\phi\in S^t_{3T_h(x)}(0,\rho,T_h(x))$
(where the set $S^t_L(\theta,\rho,T)$ of $T$-well-filtered signals is defined in
Definition 1 of Part I) and

$$\forall\tau\in Z(B_h(x)):\quad |\phi_\tau - f_\tau| \le h^{k-d/p}\|F\|_{p,B_h(x)}. \tag{3}$$

In the sequel, we use for $F^{k,\rho,p}(B;R)$ also the shortened notation
$F[\psi]$, where $\psi$ stands for the collection of "parameters"
$(k,\rho,p,B,R)$.

Remark The motivating example of locally well-filtered signals is that of
modulated smooth signals as follows. Let a cube $B\subset[0,1]^d$,
$p\in(d,\infty]$, positive integers $k,\nu$ and a real $R\ge0$ be given.
Consider a collection of $\nu$ functions $g_1,\dots,g_\nu\in C([0,1]^d)$ which
are $k$ times continuously differentiable and satisfy the constraint

$$\sum_{\ell=1}^{\nu}\|D^kg_\ell\|_{p,B}\le R.$$

Let $\omega(\ell)\in\mathbb{R}^d$, and let

$$f(x) = \sum_{\ell=1}^{\nu} g_\ell(x)\exp\{i\omega^T(\ell)x\}.$$

By the standard argument [1], whenever $x = m^{-1}t\in\Gamma_n\cap\mathrm{int}\,B$
and $B_h(x)$ is admissible for $x$, the Taylor polynomial $\Phi^x_\ell(\cdot)$,
of order $k-1$, taken at $x$, of $g_\ell$ satisfies the inequality

$$u\in B_h(x)\ \Rightarrow\ |\Phi^x_\ell(u) - g_\ell(u)|\le c_1h^{k-d/p}\|F_\ell\|_{p,B_h(x)},\quad\text{where } F_\ell(u) = |D^kg_\ell(u)|$$

(here and in what follows, $c_i$ are positive constants depending solely on $d$,
$k$ and $\nu$). It follows that if
$\Phi^x(u) = \sum_{\ell=1}^{\nu}\Phi^x_\ell(u)\exp\{i\omega^T(\ell)u\}$ then

$$u\in B_h(x)\ \Rightarrow\ |\Phi^x(u) - f(u)|\le h^{k-d/p}\|F\|_{p,B_h(x)}\qquad\Big[F = c_2\sum_{\ell}F_\ell,\ \|F\|_{p,B}\le c_3R\Big]. \tag{4}$$

Now observe that the exponential polynomial $\phi(\tau) = \Phi^x(m^{-1}\tau)$
belongs to $S^t_L(0,c_4,T)$ for any $0\le T\le L\le\infty$ (Proposition 10 of
Part I). Combining this fact with (4), we conclude that
$f\in F^{k,\rho(\nu,k,d),p}(B;c(\nu,k,d)R)$.
2.2 Accuracy measures



Let us fix $\gamma\in(0,1)$ and $q\in[1,\infty]$. Given an estimate $\hat f_n$
of the restriction $f|_{\Gamma_n}$ of $f$ on the grid $\Gamma_n$, based on
observations (2) (i.e., a Borel real-valued function of $x\in\Gamma_n$ and
$y\in\mathbb{C}^n$) and $\psi = (k,\rho,p,B,R)$, let us characterize the quality
of the estimate on the set $F[\psi]$ by the worst-case risks

$$R_q(\hat f_n;F[\psi]) = \sup_{f\in F[\psi]}\Big(E\big\{\big|\hat f_n(\cdot;y_f(\epsilon)) - f|_{\Gamma_n}(\cdot)\big|^2_{q,B_\gamma}\big\}\Big)^{1/2}.$$



3 Estimator construction 



The recovering routine we are about to build is aimed at estimating functions
from classes $F^{k,\rho,p}(B;R)$ with parameters $k,\rho,p,B,R$ unknown in
advance. The only design parameters of the routine are an a priori upper bound
$\mu$ on the parameter $\rho$ and a $\gamma\in(0,1)$.



3.1 Preliminaries



From now on, we denote by $\Theta = \Theta_{(n)}$ the deterministic function of
observation noises defined as follows. For every cube $B\subset[0,1]^d$ with
vertices in $\Gamma_n$, we consider the discrete Fourier transform of the
observation noises reduced to $B\cap\Gamma_n$, and take the maximum of the moduli
of the resulting Fourier coefficients; let it be denoted $\theta_B(e)$. By
definition,

$$\Theta \equiv \Theta_{(n)} = \sigma^{-1}\max_B\theta_B(e),$$

where the maximum is taken over all cubes $B$ of the indicated type. By the
origin of $\Theta_{(n)}$, due to the classical results on maxima of Gaussian
processes (cf. also Lemma 15 of Part I), we have

$$\forall w\ge1:\quad \mathrm{Prob}\big\{\Theta_{(n)} > w\sqrt{\ln n}\big\}\le\exp\{-cw^2\ln n\}, \tag{5}$$

where $c>0$ depends solely on $d$.
3.2 Building blocks: window estimates



To recover a signal $f$ via $n = m^d$ observations (2), we use point-wise window
estimates of $f$ defined as follows.

Let us fix a point $x = m^{-1}t\in\Gamma_n^o$; our goal is to build an estimate
of $f(x)$. Let $B_h(x)$ be an admissible window for $x$. We associate with this
window an estimate $\hat f_n^h = \hat f_n^h(x;y_f(\epsilon))$ of $f(x)$ defined
as follows. If the window is "very small", specifically, $h\le m^{-1}$, so that
$x$ is the only point from the observation grid $\Gamma_n$ in $B_h(x)$, we set
$T_h(x) = 0$ and $\hat f_n^h = y_t$. For a larger window, we choose the largest
nonnegative integer $T = T_h(x)$ such that

$$Z(B_h(x))\supset\{\tau : |\tau-t|\le 4T\}$$

and apply Algorithm A of Part I to build the estimate of $f_t = f(x)$, the
design parameters of the algorithm being $(\mu,T_h(x))$. Let the resulting
estimate be denoted by $\hat f_n^h = \hat f_n^h(x;y_f(\epsilon))$.

To characterize the quality of the estimate
$\hat f_n^h = \hat f_n^h(x;y_f(\epsilon))$, let us set

$$\Phi_\mu(f,B_h(x)) = \min_p\Big\{\max_{\tau\in Z(B_h(x))}|p_\tau - f_\tau| \ :\ p\in S^t_{3T_h(x)}(0,\mu,T_h(x))\Big\}.$$



Lemma 2 One has 



Assuming that $h>m^{-1}$ and combining (6) with the result of Theorem 4 of
Part I we come to the following upper bound on the error of estimating $f(x)$
by the estimate $\hat f_n^h(x;\cdot)$:

$$|f(x) - \hat f_n^h(x;y_f(\epsilon))|\le C_1\Big[\Phi_\mu(f,B_h(x)) + \frac{\sigma}{\sqrt{nh^d}}\,\Theta_{(n)}\Big] \tag{7}$$

(note that $(2T_h(x)+1)^{-d/2}\le C_0(nh^d)^{-1/2}$). For evident reasons (7)
holds true for "very small windows" (those with $h\le m^{-1}$) as well.



3.3 The adaptive estimate 



We are about to "aggregate" the window estimates into an adaptive estimate,
applying Lepskii's adaptation scheme in the same fashion as in [30,19,20].

Let us fix a "safety factor" $\omega$ in such a way that the event
$\Theta_{(n)}>\omega\sqrt{\ln n}$ is "highly improbable", namely,

$$\mathrm{Prob}\big\{\Theta_{(n)}>\omega\sqrt{\ln n}\big\}\le\ \ldots \tag{8}$$

by (5), the required $\omega$ may be chosen as a function of $\mu$, $d$ only. We
are to describe the basic blocks of the construction of the adaptive estimate.

"Good" realizations of noise. Let us define the set of "good realizations of
noise" as

$$\Xi_n = \{\epsilon \mid \Theta_{(n)}\le\omega\sqrt{\ln n}\}. \tag{9}$$

Now (7) implies the "conditional" error bound

$$\epsilon\in\Xi_n\ \Rightarrow\ |f(x) - \hat f_n^h(x;y_f(\epsilon))|\le C_1\big[\Phi_\mu(f,B_h(x)) + S_n(h)\big],\qquad S_n(h) = \frac{\sigma}{\sqrt{nh^d}}\,\omega\sqrt{\ln n}. \tag{10}$$

Observe that as $h$ grows, the "deterministic term" $\Phi_\mu(f,B_h(x))$ does
not decrease, while the "stochastic term" $S_n(h)$ decreases.

The "ideal" window. Let us define the ideal window $B_*(x)$ as the largest
admissible window for which the stochastic term dominates the deterministic
one:

$$B_*(x) = B_{h_*(x)}(x),\qquad h_*(x) = \max\{h \mid h>0,\ B_h(x)\subset[0,1]^d,\ \Phi_\mu(f,B_h(x))\le S_n(h)\}. \tag{11}$$

Note that such a window does exist, since $S_n(h)\to\infty$ as $h\to+0$. Besides
this, since the cubes $B_h(x)$ are open, the quantity $\Phi_\mu(f,B_h(x))$ is
continuous from the left, so that

$$0<h\le h_*(x)\ \Rightarrow\ \Phi_\mu(f,B_h(x))\le S_n(h). \tag{12}$$

Thus, the ideal window $B_*(x)$ is well-defined for every $x$ possessing
admissible windows, i.e., for every
$x\in\Gamma_n^o = \{m^{-1}t : t\in\mathbb{Z}^d,\ 0<t,\ |t|<m\}$.
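The trade-off behind the ideal window can be illustrated numerically. The following sketch makes two assumptions that are not taken verbatim from the text: the stochastic term is taken in the form $S_n(h) = \sigma\omega\sqrt{\ln n}/\sqrt{nh^d}$, and the deterministic term is replaced by the model bound $Rh^{k-d/p}$ suggested by (3). Under these assumptions the balance equation $Rh^{k-d/p} = S_n(h)$ has a closed-form solution. All function names and numerical values are illustrative.

```python
import math


def stochastic_term(h, n, d, sigma, omega):
    """Assumed form S_n(h) = sigma * omega * sqrt(ln n) / sqrt(n * h^d)."""
    return sigma * omega * math.sqrt(math.log(n)) / math.sqrt(n * h ** d)


def balanced_width(n, d, k, p, R, sigma, omega):
    """Width h solving R * h^(k - d/p) = S_n(h) for the model deterministic term."""
    sigma_hat = sigma * math.sqrt(math.log(n) / n)
    return (omega * sigma_hat / R) ** (2.0 / (2 * k - 2 * d / p + d))


n, d, k, p, R, sigma, omega = 128 ** 2, 2, 2, 4.0, 10.0, 1.0, 2.0
h_star = balanced_width(n, d, k, p, R, sigma, omega)
det = R * h_star ** (k - d / p)                     # model deterministic term
sto = stochastic_term(h_star, n, d, sigma, omega)   # stochastic term at h_star
```

At `h_star` the two terms coincide; for smaller widths the stochastic term dominates, for larger widths the deterministic one does.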
Normal windows. Assume that $\epsilon\in\Xi_n$. Then the errors of all estimates
$\hat f_n^h(x;y)$ associated with admissible windows not larger than the ideal
one are dominated by the corresponding stochastic terms:

$$\epsilon\in\Xi_n,\ 0<h\le h_*(x)\ \Rightarrow\ |f(x) - \hat f_n^h(x;y_f(\epsilon))|\le 2C_1S_n(h) \tag{13}$$

(by (10) and (12)). Let us fix an $\epsilon\in\Xi_n$ (and thus a realization $y$
of the observations) and let us call an admissible for $x$ window $B_h(x)$
normal, if the associated estimate $\hat f_n^h(x;y)$ differs from every estimate
associated with a smaller window by no more than $4C_1$ times the stochastic
term of the latter estimate, i.e.

$$\text{Window } B_h(x)\text{ is normal}\iff\left\{\begin{array}{l} B_h(x)\text{ is admissible,}\\ \forall h',\ 0<h'\le h:\ |\hat f_n^{h'}(x;y) - \hat f_n^h(x;y)|\le 4C_1S_n(h')\quad[y = y_f(\epsilon)].\end{array}\right. \tag{14}$$

Note that if $x\in\Gamma_n^o$, then $x$ possesses a normal window, specifically,
the window $B_{m^{-1}}(x)$. Indeed, this window contains a single observation
point, namely, $x$ itself, so that the corresponding estimate, same as every
estimate corresponding to a smaller window, by construction coincides with the
observation at $x$, so that all the estimates $\hat f_n^{h'}(x;y)$,
$0<h'\le m^{-1}$, are the same. Note also that (13) implies that

(!) If $\epsilon\in\Xi_n$, then the ideal window $B_*(x)$ is normal.

The adaptive estimate $\hat f_n(x;y)$. The property of an admissible window to
be normal is "observable" - given the observations $y$, we can say whether a
given window is or is not normal. Besides this, it is clear that among all
normal windows there exists the largest one $B^+(x) = B_{h^+(x)}(x)$. The
adaptive estimate $\hat f_n(x;y)$ is exactly the window estimate associated with
the window $B^+(x)$. Note that from (!) it follows that

(!!) If $\epsilon\in\Xi_n$, then the largest normal window $B^+(x)$ contains the
ideal window $B_*(x)$.

By definition of a normal window, under the premise of (!!) we have

$$|\hat f_n^{h^+(x)}(x;y) - \hat f_n^{h_*(x)}(x;y)|\le 4C_1S_n(h_*(x)),$$

and we come to the conclusion as follows:

(*) If $\epsilon\in\Xi_n$, then the error of the estimate
$\hat f_n(x;y)\equiv\hat f_n^{h^+(x)}(x;y)$ is dominated by the error bound (10)
associated with the ideal window:

$$\epsilon\in\Xi_n\ \Rightarrow\ |\hat f_n(x;y) - f(x)|\le 5C_1\big[\Phi_\mu(f,B_{h_*(x)}(x)) + S_n(h_*(x))\big]. \tag{15}$$



Thus, the estimate $\hat f_n(\cdot;\cdot)$ - which is based solely on
observations and does not require any a priori knowledge of the "parameters of
well-filterability of $f$" - possesses basically the same accuracy as the
"ideal" estimate associated with the ideal window (provided, of course, that the
realization of noises is not "pathological": $\epsilon\in\Xi_n$).

Note that the adaptive estimate $\hat f_n(x;y)$ we have built depends solely on
the "design parameters" $\mu$, $\gamma$ (recall that $C_1$ depends on $\mu$,
$\gamma$), the volume of observations $n$ and the dimension $d$.
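The selection rule for the largest normal window can be sketched as follows. This is only an illustration under simplifying assumptions: the window widths are restricted to a finite increasing grid $h_0 < h_1 < \dots$, the window estimates are supplied as plain numbers (in the paper they come from Algorithm A of Part I), and the function names and the value of `C1` are hypothetical.

```python
def largest_normal_window(estimates, stochastic, C1):
    """Index of the largest normal window on a finite grid of widths.

    estimates[i] and stochastic[i] are the window estimate and the
    stochastic term S_n(h_i) for widths h_0 < h_1 < ...  Window i is
    normal if |estimates[j] - estimates[i]| <= 4*C1*stochastic[j]
    for every j <= i.
    """
    best = 0  # the smallest window is always normal
    for i in range(len(estimates)):
        if all(abs(estimates[j] - estimates[i]) <= 4 * C1 * stochastic[j]
               for j in range(i + 1)):
            best = i
    return best


def adaptive_estimate(estimates, stochastic, C1=1.0):
    """Window estimate associated with the largest normal window."""
    return estimates[largest_normal_window(estimates, stochastic, C1)]
```

For example, with estimates `[1.0, 1.1, 1.05, 3.0]`, stochastic terms `[0.4, 0.2, 0.1, 0.05]` and `C1 = 0.25`, the first three windows are normal while the fourth deviates from the first by 2.0 > 0.4, so the rule returns the third estimate.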



4 Main result 



Our main result is as follows: 

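Theorem 3 Let $\gamma\in(0,1)$, $\mu\ge1$ be an integer, let
$F = F^{k,\rho,p}(B;R)$ be a family of locally well-filtered signals associated
with a cube $B\subset[0,1]^d$ with $mD(B)\ge1$, $\rho\le\mu$ and $p>d$. For
properly chosen $P\ge1$ depending solely on $\mu,d,p,\gamma$ and nonincreasing
in $p>d$ the following statement holds true:

Suppose that the volume $n = m^d$ of observations (2) is large enough, namely,

$$P^{-1}n^{\frac{2kp+d(p-2)}{2dp}}\ \ge\ \frac{R\sqrt{n}}{\sigma\sqrt{\ln n}}\ \ge\ P\,[D(B)]^{-\frac{2kp+d(p-2)}{2p}}, \tag{16}$$

where $D(B)$ is the edge of the cube $B$.

Then for every $q\in[1,\infty]$ the worst case, with respect to $F$, $q$-risk of
the adaptive estimate $\hat f_n(\cdot,\cdot)$ associated with the parameter
$\mu$ can be bounded as follows:

$$R_q(\hat f_n;F) = \sup_{f\in F}\Big(E\big\{|\hat f_n(\cdot;y_f(\epsilon)) - f(\cdot)|^2_{q,B_\gamma}\big\}\Big)^{1/2}\le PR\Big(\frac{\sigma^2\ln n}{R^2n}\Big)^{\beta(p,k,d,q)}[D(B)]^{d\lambda(p,k,d,q)}, \tag{17}$$

where

$$\beta(p,k,d,q) = \left\{\begin{array}{ll}\frac{k}{2k+d}, & \text{when } q\le\frac{(2k+d)p}{d},\\[4pt] \frac{k+d\left(\frac1q-\frac1p\right)}{2k-2d/p+d}, & \text{when } q>\frac{(2k+d)p}{d},\end{array}\right.\qquad \lambda(p,k,d,q) = \left\{\begin{array}{ll}\frac1q-\frac{d}{(2k+d)p}, & \text{when } q\le\frac{(2k+d)p}{d},\\[4pt] 0, & \text{when } q>\frac{(2k+d)p}{d}\end{array}\right.$$

(recall that here $B_\gamma$ is the concentric to $B$ $\gamma$ times smaller
cube).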
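The exponents in the risk bound are simple rational functions of $(p,k,d,q)$ and can be tabulated exactly. The sketch below assumes the piecewise form consistent with the proof in the appendix, with threshold $q_* = (2k+d)p/d$: $\beta = k/(2k+d)$ and $\lambda = 1/q - d/((2k+d)p)$ for $q\le q_*$, and $\beta = (k+d(1/q-1/p))/(2k-2d/p+d)$, $\lambda = 0$ for $q>q_*$. The function names `beta` and `lam` are hypothetical.

```python
from fractions import Fraction


def beta(p, k, d, q):
    """Rate exponent beta(p, k, d, q), under the assumed piecewise form."""
    p, q = Fraction(p), Fraction(q)
    if q <= (2 * k + d) * p / d:
        return Fraction(k, 2 * k + d)
    return (k + d * (1 / q - 1 / p)) / (2 * k - 2 * d / p + d)


def lam(p, k, d, q):
    """Exponent lambda(p, k, d, q) of the cube edge D(B), same assumption."""
    p, q = Fraction(p), Fraction(q)
    threshold = (2 * k + d) * p / d
    return 1 / q - d / ((2 * k + d) * p) if q <= threshold else Fraction(0)
```

For $p=4$, $k=2$, $d=2$ the threshold is $q_*=12$; `beta(4, 2, 2, 2)` equals $1/3$, while for $q=24$ the exponent drops to $19/60$, and `lam` vanishes from $q_*$ on.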
Note that the rates of convergence to 0, as $n\to\infty$, of the risks
$R_q(\hat f_n;F)$ of our adaptive estimate on the families
$F = F^{k,\rho,p}(B;R)$ are exactly the same as those stated by Theorem 3 from
[31] (see also [30,9,19,33]) in the case of recovering non-parametric smooth
regression functions from Sobolev balls. It is well-known that in the smooth
case the latter rates are optimal in order, up to factors logarithmic in $n$.
Since the families of locally well-filtered signals are much wider than local
Sobolev balls (smooth signals are trivial examples of modulated smooth
signals!), it follows that the rates of convergence stated by Theorem 3 are also
nearly optimal.



5 Simulation examples 

In this section we present the results of a small simulation study of the
adaptive filtering algorithm applied to the 2-dimensional de-noising problem.
The simulation setting is as follows: we consider real-valued signals

$$y_\tau = s_\tau + e_\tau,\qquad \tau = (\tau_1,\tau_2)\in\{1,\dots,m\}^2,$$

$e_{(1,1)},\dots,e_{(m,m)}$ being independent standard Gaussian random
variables. The problem is to estimate, given observations $(y_\tau)$, the values
of the signal $f(x_\tau)$ on the grid
$\Gamma_m = \{m^{-1}\tau,\ 1\le\tau_1,\tau_2\le m\}$, where
$f(x_\tau) = s_\tau$. The value $m = 128$ is common to all experiments.

We consider signals which are sums of three harmonic components:

$$s_\tau = \alpha\big[\sin(m^{-1}\omega_1^T\tau + \theta_1) + \sin(m^{-1}\omega_2^T\tau + \theta_2) + \sin(m^{-1}\omega_3^T\tau + \theta_3)\big];$$

the components of the frequencies $\omega_i$ and the phase shifts $\theta_i$,
$i = 1,\dots,3$, are drawn independently from the uniform distributions on
$[0,\omega_{\max}]$ and $[0,1]$, respectively, and the coefficient $\alpha$ is
chosen to obtain the signal-to-noise ratio equal to one.
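A test signal of this kind can be generated as follows. This is a sketch under stated assumptions: the signal-to-noise ratio is interpreted as the ratio of the empirical mean square of the signal to the noise variance (equal to one here), and a smaller grid than $m = 128$ is used to keep the example fast. The function names are hypothetical.

```python
import math
import random


def harmonic_signal(m, omega_max, rng):
    """Sum of three 2-D harmonics on {1..m}^2, scaled to unit mean square."""
    # Each component: a 2-D frequency in [0, omega_max]^2 and a phase in [0, 1].
    comps = [((rng.uniform(0, omega_max), rng.uniform(0, omega_max)),
              rng.uniform(0, 1)) for _ in range(3)]
    raw = [[sum(math.sin((w[0] * i + w[1] * j) / m + th) for w, th in comps)
            for j in range(1, m + 1)] for i in range(1, m + 1)]
    power = sum(v * v for row in raw for v in row) / (m * m)
    alpha = 1.0 / math.sqrt(power)  # SNR = 1 for unit noise variance (assumption)
    return [[alpha * v for v in row] for row in raw]


def observe(signal, rng):
    """Add independent N(0, 1) noise to every grid value."""
    return [[v + rng.gauss(0.0, 1.0) for v in row] for row in signal]


rng = random.Random(1)
s = harmonic_signal(32, 8.0, rng)
y = observe(s, rng)
```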

In the simulations presented here we compare the result of adaptive recovery
with $T = 10$ to that of a "standard nonparametric recovery", i.e. the recovery
by the locally linear estimator with square window. We have done $k = 100$
independent runs for each of eight values of $\omega_{\max}$,

$$\omega_{\max}\in\{1.0,\ 2.0,\ 4.0,\ 8.0,\ 16.0,\ 32.0,\ 64.0,\ 128.0\}.$$

In Table 1 we summarize the results for the mean integrated squared error
(MISE) of the estimation, computed over the 100 runs and the grid points
$\tau = (1,1),\dots,(m,m)$.

The observed phenomenon is as expected: for slowly oscillating signals the
quality of the adaptive recovery is slightly worse than that of the "standard
recovery", which is tuned for estimation of regular signals. When we raise the
frequency of the signal components, the adaptive recovery consistently
outperforms the standard recovery. Finally, the standard recovery is clearly
unable to recover highly oscillating signals (cf. Figures 1-4).



Appendix 

We denote by $C(\mathbb{Z}^d)$ the linear space of complex-valued fields over
$\mathbb{Z}^d$. A field $r\in C(\mathbb{Z}^d)$ with finitely many nonzero
entries $r_\tau$ is called a filter. We use the common notation $\Delta_j$,
$j = 1,\dots,d$, for the "basic shift operators" on $C(\mathbb{Z}^d)$:

$$(\Delta_jr)_{\tau_1,\dots,\tau_d} = r_{\tau_1,\dots,\tau_{j-1},\tau_j-1,\tau_{j+1},\dots,\tau_d},$$

and denote by $r(\Delta)x$ the output of a filter $r$, the input to the filter
being a field $x\in C(\mathbb{Z}^d)$, so that
$(r(\Delta)x)_t = \sum_\tau r_\tau x_{t-\tau}$.
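The filter notation translates directly into code. In the sketch below a filter is stored as a dictionary of its nonzero entries and a field as a function of an index tuple; both representations, and the helper names `apply_filter` and `shift_filter`, are assumptions made for illustration.

```python
def apply_filter(r, x, t):
    """(r(Delta) x)_t = sum over tau of r_tau * x_{t - tau}.

    r: dict mapping index tuples tau to the nonzero filter entries;
    x: function from index tuples to numbers (a field on Z^d).
    """
    return sum(r_tau * x(tuple(ti - si for ti, si in zip(t, tau)))
               for tau, r_tau in r.items())


def shift_filter(j, d):
    """Filter representing the basic shift Delta_j: unit entry at e_j."""
    return {tuple(1 if i == j else 0 for i in range(d)): 1}


x = lambda t: 10 * t[0] + t[1]                         # toy field on Z^2
shifted = apply_filter(shift_filter(0, 2), x, (3, 5))  # value of x at (2, 5)
```

Applying the shift along the first coordinate at the point $(3,5)$ returns the value of the field at $(2,5)$, in agreement with the definition of $\Delta_j$.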



5.1 Proof of Lemma 2.

To save notation, let $B = B_h(x)$ and $T = T_h(x)$. Let $p\in C(\mathbb{Z}^d)$
be such that $p\in S^t_{3T}(0,\mu,T)$ and
$|p_\tau - f_\tau|\le\Phi_\mu(f,B_h(x))$ for all $\tau\in Z(B_h(x))$. Since
$p\in S^t_{3T}(0,\mu,T)$, there exists a filter $q\in C_T(\mathbb{Z}^d)$ such
that $|q|_2\le\mu(2T+1)^{-d/2}$ and $(q(\Delta)p)_\tau = p_\tau$ whenever
$|\tau-t|\le 3T$. Setting $\delta_\tau = f_\tau - p_\tau$, we have for any
$\tau$, $|\tau-t|\le 3T$,

$$\begin{array}{rcl}
|f_\tau - (q(\Delta)f)_\tau| &\le& |\delta_\tau| + |p_\tau - (q(\Delta)p)_\tau| + |(q(\Delta)\delta)_\tau|\\
&\le& \Phi_\mu(f,B_h(x)) + |q|_1\max\{|\delta_\nu| : |\nu-\tau|\le T\}\\
&\le& \Phi_\mu(f,B_h(x)) + |q|_2(2T+1)^{d/2}\max\{|\delta_\nu| : |\nu-\tau|\le T\}\\
&& [\text{note that } |\tau-t|\le 3T \text{ and } |\nu-\tau|\le T \text{ implies } |\nu-t|\le 4T]\\
&\le& \Phi_\mu(f,B_h(x))(1+\mu),
\end{array}$$

as required in (6). □
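The passage from the $\ell_1$-norm to the $\ell_2$-norm of the filter in the chain above is the Cauchy-Schwarz inequality over the $(2T+1)^d$ points of the support of $q$. Spelled out as a short worked step (the notation $|q|_1 = \sum_\nu|q_\nu|$ is an assumption, as it is not defined in this section):

```latex
\[
|(q(\Delta)\delta)_\tau|
  = \Big|\sum_{|\nu|\le T} q_\nu\,\delta_{\tau-\nu}\Big|
  \le |q|_1 \max_{|\nu-\tau|\le T}|\delta_\nu|,
\qquad
|q|_1 = \sum_{|\nu|\le T} 1\cdot|q_\nu|
  \le (2T+1)^{d/2}\,|q|_2 \le \mu .
\]
```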



5.2 Proof of Theorem 3 



In the main body of the proof, we focus on the case $p,q<\infty$; the case of
infinite $p$ and/or $q$ will be considered at the concluding step 5°.
Let us fix a family of well-filtered signals $F = F^{k,\rho,p}(B;R)$ with the
parameters satisfying the premise of Theorem 3 and a function $f$ from this
class.
Recall that by the definition of $F$ there exists a function $F\ge0$,
$\|F\|_{p,B}\le R$, such that for all $x = m^{-1}t\in(\mathrm{int}\,B)\cap\Gamma_n$
and all $h$ with $B_h(x)\subset B$:

$$\Phi_\mu(f,B_h(x))\le P_1h^{k-d/p}\,\Omega(f,B_h(x)),\qquad \Omega(f,B') = \Big(\int_{B'}F^p(u)\,du\Big)^{1/p}. \tag{18}$$

From now on, $P$ (perhaps with sub- or superscripts) are quantities $\ge1$
depending on $\mu,d,\gamma,p$ only and nonincreasing in $p>d$.


1°. We need the following auxiliary result:
Lemma 4 Assume that

^ Vh^ > Pi(/x + 3)^-'^/^'+'^/2 — . (19) 



n 



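Given a point $x\in\Gamma_n\cap B_\gamma$, let us choose the largest $h = h(x)$
such that

$$\begin{array}{ll}(a): & h\le(1-\gamma)D(B),\\ (b): & P_1h^{k-d/p}\,\Omega(f,B_h(x))\le S_n(h).\end{array} \tag{20}$$

Then $h(x)$ is well-defined and

$$h(x)\ge m^{-1}. \tag{21}$$

Besides this, the error at $x$ of the adaptive estimate $\hat f_n$ as applied to
$f$ can be bounded as follows:

$$|\hat f_n(x;y) - f(x)|\le C_2\big[S_n(h(x))\mathbf{1}\{\epsilon\in\Xi_n\} + \sigma\Theta_{(n)}\mathbf{1}\{\epsilon\notin\Xi_n\}\big]. \tag{22}$$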

Proof: The quantity $h(x)$ is well-defined, since for small positive $h$ the
left hand side in (20.b) is close to 0, while the right hand side is large.
From (19) it follows that $h = m^{-1}$ satisfies (20.a), so that
$B_{m^{-1}}(x)\subset B$. Moreover, (19.b) implies that

$$P_1m^{-(k-d/p)}R\le S_n(m^{-1});$$

the latter inequality, in view of $\Omega(f,B_{m^{-1}}(x))\le R$, says that
$h = m^{-1}$ satisfies (20.b) as well. Thus, $h(x)\ge m^{-1}$, as claimed in
(21).

Consider the window $B_{h(x)}(x)$. By (20.a) it is admissible for $x$, while
from (20.b) combined with (18) we get $\Phi_\mu(f,B_{h(x)}(x))\le S_n(h(x))$. It
follows that the ideal window $B_*(x)$ of $x$ is not smaller than
$B_{h(x)}(x)$.

Assume that $\epsilon\in\Xi_n$. Then, according to (15), we have

$$|\hat f_n(x;y) - f(x)|\le 5C_1\big[\Phi_\mu(f,B_{h_*(x)}(x)) + S_n(h_*(x))\big]. \tag{23}$$

Now, by the definition of an ideal window,
$\Phi_\mu(f,B_{h_*(x)}(x))\le S_n(h_*(x))$, and the right hand side in (23) does
not exceed $10C_1S_n(h_*(x))\le 10C_1S_n(h(x))$ (recall that, as we have seen,
$h_*(x)\ge h(x)$), as required in (22).

Now let $\epsilon\notin\Xi_n$. Note that $\hat f_n(x;y)$ is a certain estimate
$\hat f_n^h(x;y)$ associated with a centered at $x$ and admissible for $x$ cube
$B_h(x)$ which is normal and such that $h\ge m^{-1}$ (the latter since the
window $B_{m^{-1}}(x)$ always is normal, and $B_h(x)$ is the largest normal
window centered at $x$). Applying (14) with $h' = m^{-1}$ (so that
$\hat f_n^{h'}(x;y) = f(x)+\sigma\epsilon_t$), we get
$|(f(x)+\sigma\epsilon_t) - \hat f_n(x;y)|\le 4C_1S_n(m^{-1})$, whence

$$|f(x) - \hat f_n(x;y)|\le\sigma|\epsilon_t| + 4C_1S_n(m^{-1})\le\sigma\Theta_{(n)} + 4C_1\sigma\omega\sqrt{\ln n}\le C_2\sigma\Theta_{(n)}$$

(recall that we are in the situation $\epsilon\notin\Xi_n$, whence
$\omega\sqrt{\ln n}\le\Theta_{(n)}$). We have arrived at (22). □

Now we are ready to complete the proof. Assume that (19) takes place, and let us
fix $q$, $\frac{(2k+d)p}{d}\le q<\infty$.

2°. Let us denote $\hat\sigma_n = \sigma\sqrt{\frac{\ln n}{n}}$. Note that for
every $x\in\Gamma_n\cap B_\gamma$ either

$$h(x) = (1-\gamma)D(B),$$

or

$$h(x) = \Big(\frac{\omega\hat\sigma_n}{P_1\Omega(f,B_{h(x)}(x))}\Big)^{\frac{2p}{2kp+(p-2)d}},$$

which means that

$$P_1h^{k-d/p}(x)\,\Omega(f,B_{h(x)}(x)) = S_n(h(x)). \tag{24}$$

Let $U$, $V$ be the sets of those $x\in B_\gamma^n\equiv\Gamma_n\cap B_\gamma$
for which the first or, respectively, the second of these possibilities takes
place. If $V$ is nonempty, let us partition it as follows.

1) We can choose $x_1\in V$ ($V$ is finite!) such that $h(x)\ge h(x_1)$
$\forall x\in V$. After $x_1$ is chosen, we set
$V_1 = \{x\in V \mid B_{h(x)}(x)\cap B_{h(x_1)}(x_1)\ne\emptyset\}$.

2) If the set $V\backslash V_1$ is nonempty, we apply the construction from 1)
to this set, thus getting $x_2\in V\backslash V_1$ such that $h(x)\ge h(x_2)$
$\forall x\in V\backslash V_1$, and set
$V_2 = \{x\in V\backslash V_1 \mid B_{h(x)}(x)\cap B_{h(x_2)}(x_2)\ne\emptyset\}$.
If the set $V\backslash(V_1\cup V_2)$ still is nonempty, we apply the same
construction to this set, thus getting $x_3$ and $V_3$, and so on.

The outlined process clearly terminates after a certain step (since $V$ is
finite). On termination, we get a collection of $M$ points
$x_1,\dots,x_M\in V$ and a partition $V = V_1\cup V_2\cup\dots\cup V_M$ with the
following properties:

(i) The cubes $B_{h(x_1)}(x_1),\dots,B_{h(x_M)}(x_M)$ are mutually disjoint;

(ii) For every $\ell\le M$ and every $x\in V_\ell$ we have $h(x)\ge h(x_\ell)$
and $B_{h(x)}(x)\cap B_{h(x_\ell)}(x_\ell)\ne\emptyset$.

We claim that also

(iii) For every $\ell\le M$ and every $x\in V_\ell$ one has

$$h(x)\ge\max\big[h(x_\ell);\ \|x-x_\ell\|_\infty\big]. \tag{25}$$

Indeed, $h(x)\ge h(x_\ell)$ by (ii), so that it suffices to verify (25) in the
case when $\|x-x_\ell\|_\infty\ge h(x_\ell)$. Since $B_{h(x)}(x)$ intersects
$B_{h(x_\ell)}(x_\ell)$, we have

$$\|x-x_\ell\|_\infty\le\tfrac12\big(h(x)+h(x_\ell)\big),$$

whence

$$h(x)\ge 2\|x-x_\ell\|_\infty - h(x_\ell)\ge\|x-x_\ell\|_\infty,$$

which is what we need.
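The greedy construction of the partition is straightforward to implement, and properties (i) and (25) can then be checked directly. The sketch below uses the fact that two open cubes $B_{h}(x)$ and $B_{h'}(x')$ intersect exactly when $\|x-x'\|_\infty < (h+h')/2$. The data layout (points as coordinate tuples, widths in a dictionary) and the function names are assumptions made for illustration.

```python
def linf(x, y):
    """Uniform (l_infinity) distance between two coordinate tuples."""
    return max(abs(a - b) for a, b in zip(x, y))


def greedy_partition(points, h):
    """Partition `points` as in steps 1)-2).

    points: list of distinct coordinate tuples; h: dict point -> width h(x) > 0.
    Returns a list of (center, cell) pairs.
    """
    remaining = list(points)
    cells = []
    while remaining:
        # Pick the remaining point with the smallest window width.
        center = min(remaining, key=lambda x: h[x])
        # Open cubes B_{h(x)}(x) and B_{h(c)}(c) intersect if and only if
        # ||x - c||_inf < (h(x) + h(c)) / 2.
        cell = [x for x in remaining
                if linf(x, center) < (h[x] + h[center]) / 2]
        cells.append((center, cell))
        remaining = [x for x in remaining if x not in cell]
    return cells
```

Each center belongs to its own cell, so the loop removes at least one point per pass and terminates after at most as many passes as there are points.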

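3°. Let us set $B_\gamma^n = \Gamma_n\cap B_\gamma$. Assume that
$\epsilon\in\Xi_n$. Substituting $h(x) = (1-\gamma)D(B)$ for $x\in U$, we have
by (22):

$$\begin{array}{rcl}
|\hat f_n(\cdot;y) - f(\cdot)|^q_{q,B_\gamma} &\le& C_2^q\,m^{-d}\sum_{x\in B_\gamma^n}S_n^q(h(x))\\
&=& C_2^q\,m^{-d}\sum_{x\in U}S_n^q(h(x)) + C_2^q\,m^{-d}\sum_{\ell=1}^{M}\sum_{x\in V_\ell}S_n^q(h(x))\\
&=& C_3^q\hat\sigma_n^q\,m^{-d}\sum_{\ell=1}^{M}\sum_{x\in V_\ell}[h(x)]^{-\frac{dq}{2}} + C_3^q\hat\sigma_n^q\,m^{-d}\sum_{x\in U}[(1-\gamma)D(B)]^{-\frac{dq}{2}}\\
{[\text{by (25)}]} &\le& C_3^q\hat\sigma_n^q\,m^{-d}\sum_{\ell=1}^{M}\sum_{x\in V_\ell}\big(\max[h(x_\ell),\|x-x_\ell\|_\infty]\big)^{-\frac{dq}{2}} + C_4^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}}\\
&\le& C_5^q\hat\sigma_n^q\sum_{\ell=1}^{M}\int\big(\max[h(x_\ell),\|x-x_\ell\|_\infty]\big)^{-\frac{dq}{2}}dx + C_4^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}}\\
&\le& C_6^q\hat\sigma_n^q\sum_{\ell=1}^{M}\int_0^\infty r^{d-1}\big(\max[h(x_\ell),r]\big)^{-\frac{dq}{2}}dr + C_4^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}},
\end{array}$$

due to $h(x)\ge m^{-1}$, see (21). Further, note that

$$\frac{dq}{2}-d+1\ \ge\ \frac{2k+d}{2}\,p-d+1\ \ge\ \frac{d^2}{2}+1$$

in view of $q\ge\frac{(2k+d)p}{d}$, $k\ge1$ and $p>d$, and

$$\begin{array}{rcl}
|\hat f_n(\cdot;y) - f(\cdot)|^q_{q,B_\gamma} &\le& C_7^q\hat\sigma_n^q\sum_{\ell=1}^{M}[h(x_\ell)]^{\frac{d(2-q)}{2}} + C_7^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}}\\
{[\text{by (24)}]} &=& C_7^q\hat\sigma_n^q\sum_{\ell=1}^{M}\Big[\frac{\omega\hat\sigma_n}{P_1\Omega(f,B_{h(x_\ell)}(x_\ell))}\Big]^{\frac{d(2-q)}{2k-2d/p+d}} + C_7^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}}\\
&=& C_7^q\,\omega^{\frac{d(2-q)}{2k-2d/p+d}}\,\hat\sigma_n^{2\beta(p,k,d,q)q}\sum_{\ell=1}^{M}\big[P_1\Omega(f,B_{h(x_\ell)}(x_\ell))\big]^{\frac{d(q-2)}{2k-2d/p+d}} + C_7^q\hat\sigma_n^q[D(B)]^{\frac{d(2-q)}{2}}
\end{array}$$

by definition of $\beta(p,k,d,q)$.

Now note that $\frac{d(q-2)}{2k-2d/p+d}\ge p$ in view of
$q\ge\frac{(2k+d)p}{d}$, so that

$$\sum_{\ell=1}^{M}\big[P_1\Omega(f,B_{h(x_\ell)}(x_\ell))\big]^{\frac{d(q-2)}{2k-2d/p+d}}\le\Big[\sum_{\ell=1}^{M}\big(P_1\Omega(f,B_{h(x_\ell)}(x_\ell))\big)^p\Big]^{\frac{dq-2d}{p(2k-2d/p+d)}}\le\big[P_1^pR^p\big]^{\frac{d(q-2)}{p(2k-2d/p+d)}}$$

(see (18) and take into account that the cubes $B_{h(x_\ell)}(x_\ell)$,
$\ell = 1,\dots,M$, are mutually disjoint by (i)). We conclude that for
$\epsilon\in\Xi_n$

$$\begin{array}{rcl}
|\hat f_n(\cdot;y_f(\epsilon)) - f(\cdot)|_{q,B_\gamma} &\le& C_7\hat\sigma_n[D(B)]^{\frac{d(2-q)}{2q}} + P_2\,\hat\sigma_n^{2\beta(p,k,d,q)}R^{\frac{d(q-2)}{q(2k-2d/p+d)}}\\
&=& C_7\hat\sigma_n[D(B)]^{\frac{d(2-q)}{2q}} + P_2R\Big(\frac{\hat\sigma_n}{R}\Big)^{2\beta(p,k,d,q)}.
\end{array} \tag{26}$$

4°. Now assume that $\epsilon\notin\Xi_n$. In this case, by (22),

$$|\hat f_n(x;y) - f(x)|\le C_2\sigma\Theta_{(n)}\qquad\forall x\in B_\gamma^n.$$

Hence, taking into account that $mD(B)\ge1$,

$$|\hat f_n(\cdot;y) - f(\cdot)|_{q,B_\gamma}\le C_2\sigma\Theta_{(n)}[D(B)]^{\frac dq}. \tag{27}$$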
5°. When combining (26) and (27), we get

$$\Big(E\big\{|\hat f_n(\cdot;y) - f(\cdot)|^2_{q,B_\gamma}\big\}\Big)^{1/2}\le C_8\max\Big[\hat\sigma_n[D(B)]^{\frac{d(2-q)}{2q}};\ P_4R\Big(\frac{\hat\sigma_n}{R}\Big)^{2\beta(p,k,d,q)};\ J\Big],$$

where



1/2 



(we have used (5) and (8)). Thus, when (19) holds, for all $d<p<\infty$ and all
$q$, $\frac{(2k+d)p}{d}\le q<\infty$, we have



{E{\\U-,y)-f{-)\\lB^f' 



< Cg max 



(28) 



Now it is easily seen that if $P\ge1$ is a properly chosen function of
$\mu,d,\gamma,p$ nonincreasing in $p>d$ and (16) takes place then

(1) assumption (19) holds,

(2) the right hand side in (28) does not exceed the quantity

$$PR\Big(\frac{\hat\sigma_n}{R}\Big)^{2\beta(p,k,d,q)} = PR\Big(\frac{\hat\sigma_n}{R}\Big)^{2\beta(p,k,d,q)}[D(B)]^{d\lambda(p,k,d,q)}$$

(recall that $q\ge\frac{(2k+d)p}{d}$, so that $\lambda(p,k,d,q) = 0$).

We conclude the bound (17) for the case of $d<p<\infty$,
$\infty>q\ge\frac{(2k+d)p}{d}$. When passing to the limit as $q\to\infty$, we
get the desired bound for $q = \infty$ as well.

Now let $d<p<\infty$ and $1\le q<q_*\equiv\frac{(2k+d)p}{d}$. By the Hölder
inequality and in view of $mD(B)\ge1$ we have

$$|\hat f_n(\cdot;y) - f(\cdot)|_{q,B_\gamma}\le C_{10}\,|\hat f_n(\cdot;y) - f(\cdot)|_{q_*,B_\gamma}[D(B)]^{d\left(\frac1q-\frac1{q_*}\right)},$$

and thus

$$R_q(\hat f_n;F)\le C_{10}R_{q_*}(\hat f_n;F)\,[D(B)]^{d\left(\frac1q-\frac1{q_*}\right)}.$$

Combining this observation with the (already proved) bound (17) associated with
$q = q_*$, we see that (17) is valid for all $q\in[1,\infty]$, provided that
$d<p<\infty$. Passing in the resulting bound to the limit as $p\to\infty$, we
conclude the validity of (17) for all $p\in(d,\infty]$, $q\in[1,\infty]$. □
Fig. 2. Recovery for $\omega_{\max} = 8.0$ (panels: standard recovery, adaptive recovery).

Fig. 3. Recovery for $\omega_{\max} = 32.0$ (panel: standard recovery).

Fig. 4. Recovery for $\omega_{\max} = 128.0$.